Influence of Part-of-Speech and Phrasal Category Universal Tag-set in Tree-to-Tree Translation Models
نویسندگان
چکیده
Tree-to-tree Statistical Machine Translation models require the use of syntactic tree structures of both the source and target side in learning rules to guide the translation process. In order to accomplish the task, available treebanks for different languages are used as the main resources to collect necessary information to handle the translation task. However, since each treebank has its own defined tags, a barrier is inherently created in highlighting alignment relationships at different syntactic levels for different tag-sets. Moreover, these models are typically over constrained. This paper presents a unified tagset for all languages at Part-of-Speech and Phrasal Category level in tree-to-tree models. Different experiments are conducted to study for its feasibility, efficiency, and translation quality.
منابع مشابه
Real-time quality monitoring in debutanizer column with regression tree and ANFIS
A debutanizer column is an integral part of any petroleum refinery. Online composition monitoring of debutanizer column outlet streams is highly desirable in order to maximize the production of liquefied petroleum gas. In this article, data-driven models for debutanizer column are developed for real-time composition monitoring. The dataset used has seven process variables as inputs and the outp...
متن کاملStudying impressive parameters on the performance of Persian probabilistic context free grammar parser
In linguistics, a tree bank is a parsed text corpus that annotates syntactic or semantic sentence structure. The exploitation of tree bank data has been important ever since the first large-scale tree bank, The Penn Treebank, was published. However, although originating in computational linguistics, the value of tree bank is becoming more widely appreciated in linguistics research as a whole. F...
متن کاملPrediction of melting points of a diverse chemical set using fuzzy regression tree
The classification and regression trees (CART) possess the advantage of being able to handlelarge data sets and yield readily interpretable models. In spite to these advantages, they are alsorecognized as highly unstable classifiers with respect to minor perturbations in the training data.In the other words methods present high variance. Fuzzy logic brings in an improvement in theseaspects due ...
متن کاملComparison of gestational diabetes prediction with artificial neural network and decision tree models
Background: Gestational diabetes mellitus (GDM) is one of the most common metabolic disorders in pregnancy, which is associated with serious complications. In the event of early diagnosis of this disease, some of the maternal and fetal complications can be prevented. The aim of this study was to early predict gestational diabetes mellitus by two statistical models including artificial neural ne...
متن کاملOvercoming the uncertainty in a research reactor LOCA in level-1 PSA; Fuzzy based fault-tree/event-tree analysis
Probabilistic safety assessment (PSA) which plays a crucial role in risk evaluation is a quantitative approach intended to demonstrate how a nuclear reactor meets the safety margins as part of the licensing process. Despite PSA merits, some shortcomings associated with the final results exist. Conventional PSA uses crisp values to represent the failure probabilities of basic events. This causes...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2013